Abstract

Financial correlations play a central role in financial theory and in many
practical applications. From a theoretical point of view, the key interest is in
a proper description of the structure and dynamics of correlations. From a
practical point of view, the emphasis is on the ability of the developed models
to provide adequate input for the numerous portfolio and risk management
procedures used in the financial industry. This is crucial, since it has long been
argued that correlation matrices determined from financial series contain a
relatively large amount of noise and, in addition, most of the portfolio and risk
management techniques used in practice can be quite sensitive to their inputs.
In this paper we introduce a model (simulation)-based approach which can
be used for a systematic investigation of the effect of the different sources of
noise in financial correlations in the portfolio and risk management context. To
illustrate the usefulness of this framework, we develop several toy models for
the structure of correlations and, by considering the finiteness of the time series
as the only source of noise, we compare the performance of several correlation
matrix estimators that were introduced in the academic literature and have since
gained wide practical use. Based on this experience, we believe that our
simulation-based approach can also be useful for the systematic investigation
of several other problems of much interest in finance.
1 Introduction



Correlation matrices of financial returns play a crucial role in several branches of
modern finance such as investment theory, capital allocation and risk management. For
example, financial correlation matrices are the key input parameters to Markowitz's
classical portfolio optimization problem [1], which aims at providing a recipe for the
selection of a portfolio of assets such that risk (quantified by the standard deviation
of the portfolio's return) is minimized for a given level of expected return. For any
practical use of the theory it would therefore be necessary to have reliable estimates for
the correlations of returns (of the assets making up the portfolio), which are usually
obtained from historical return series data. However, if one estimates an $n \times n$
correlation matrix from $n$ time series of length $T$ each, then, since $T$ is usually
limited for practical reasons, one inevitably introduces estimation error, which for
large $n$ can become so overwhelming that the whole applicability of the theory may
become questionable.

This difficulty has been well known to economists for a long time (see e.g. [2] and
the numerous references therein). Several aspects of the effect of noise (in the
correlation matrices determined from empirical data) on the classical portfolio selection
problem have been investigated e.g. in refs. [3]. One way to cope with the problem of
noise is to impose some structure on the correlation matrix, which may certainly
introduce some bias in the estimation but, by effectively reducing the dimensionality of
the problem, can in fact be expected to improve the overall performance of the
estimation. The best-known such structure is that imposed by the single-index (or market)
model, which has attracted great interest in the academic literature (see e.g. [2] for an
overview and references) and has also become widely used in the financial industry (the
coefficient "beta", relating the returns of an asset to the returns of the corresponding
broad market index, has long been common talk in the financial community). On
economic or statistical grounds, several other correlation structures have been
experimented with in the academic literature and the financial industry, for example
multi-index models, grouping by industry sectors, macroeconomic factor models, models
based on principal component analysis etc. Several studies (see e.g. refs. [4]) attempt to
compare the performance of these correlation estimation procedures as input providers for
the portfolio selection problem, although all these studies have been restricted to
the use of given specific empirical samples. More recently, other procedures that impose
some structure on correlations (e.g. Bayesian shrinkage estimators) or bounds directly
on the portfolio weights (e.g. no short selling) have been explored, see e.g. refs. The
general conclusion of all these studies is that reducing the dimensionality of the
problem by imposing some structure on the correlation matrix may be of great help for the
selection of portfolios with better risk-return characteristics.

The problem of estimation noise in financial correlation matrices has been put into
a new light by [HI IE] from the point of view of random matrix theory. These studies
have shown that empirical correlation matrices deduced from financial return series
contain such a high amount of noise that, apart from a few large eigenvalues and the
corresponding eigenvectors, their structure can essentially be regarded as random. In
[7], e.g., it is reported that about 94% of the spectrum of correlation matrices
determined from return series of the S&P 500 stocks can be fitted by that of a random
matrix. Furthermore, two subsequent studies Elj have shown that the risk-return
characteristics of optimized portfolios could be improved if, prior to optimization, one
filtered out the lower part of the eigenvalue spectrum of the correlation matrix in an
attempt to remove (at least partially) the noise, a procedure similar to principal
component analysis. Other approaches inspired by physics that aim to extract
information from noisy correlation data have been introduced in [11, 12].
It is important to note that all the above studies have used (given) empirical datasets,
which, in addition to the noise due to the finite length of the time series, also contain
several other sources of error (caused by non-stationarity, market microstructure etc.).

The motivation of our previous study [13] came from this context. In order to get rid
of these additional sources of error, we based our analysis on data artificially generated
from some toy models. This procedure offers a major advantage in that the "true"
parameters of the underlying stochastic process, and hence also the correlation matrix,
are exactly known. The key observation of [13] is that the effect of noise strongly depends
on the ratio $T/n$, where $n$ is the size of the portfolio and $T$ is the length of the
available time series. Moreover, in the limit $n \to \infty$, $T \to \infty$ with
$T/n = \mathrm{const}$, the suboptimality of the portfolio optimized using the "noisy"
correlation matrix (with respect to the portfolio obtained using the "true" matrix) is
exactly $1/\sqrt{1 - n/T}$. Therefore, since the length of the time series $T$ is limited
in any practical application, any bound one would like to impose on the effect of noise
translates, in fact, into a constraint on the portfolio size $n$.
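
To make the last point concrete, the following minimal sketch evaluates the asymptotic
factor $1/\sqrt{1 - n/T}$ and inverts it to obtain the largest portfolio size compatible
with a given tolerance. The function names `noise_factor` and `max_portfolio_size` are
hypothetical, and the formula is used as if it held at finite $n$ and $T$, which is only
an approximation.

```python
import math

def noise_factor(n: int, T: int) -> float:
    """Asymptotic suboptimality 1 / sqrt(1 - n/T); only defined for T > n."""
    if T <= n:
        raise ValueError("the sample covariance matrix is singular for T <= n")
    return 1.0 / math.sqrt(1.0 - n / T)

def max_portfolio_size(T: int, q_max: float) -> int:
    """Largest n with 1 / sqrt(1 - n/T) <= q_max, for a tolerance q_max > 1."""
    return math.floor(T * (1.0 - 1.0 / q_max**2))

# T/n = 1.5: the noisy optimum carries about 73% more risk than the true optimum.
print(round(noise_factor(1000, 1500), 2))
# With T = 500 observations, a 10% tolerance allows at most 86 assets.
print(max_portfolio_size(500, 1.1))
```

Under these assumptions a tolerance of 10% with $T = 500$ observations already restricts
the portfolio to 86 assets, which illustrates how quickly the bound on $n$ becomes binding.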

The aim of this paper is, besides extending the analysis of the previous study, to
introduce a model (simulation)-based approach that can be generally used for the
systematic investigation of correlations in financial markets and for the study of the
effect of different sources of noise on the numerous procedures based on correlation
matrices extracted from financial data. As an illustration of the usefulness of this
approach, we introduce several toy models that aim to progressively incorporate the
relevant features of real-life financial correlations and, in the world of these models,
we study the effect of noise (in this case only the estimation error caused by the
finiteness of the surrogate time series generated by the models) on the classical
portfolio optimization problem. More precisely, we compare the performance of different
correlation matrix "estimation" methods (e.g. the filtering procedure introduced in
[01 El!) in providing inputs for the selection of portfolios with optimal risk-return
characteristics. The approach is in fact very common in physics, where one starts with
some bare model and progressively adds finer and finer elements, while studying the
behavior of the "world" embodied by the model by comparing it to real-life
(experimental) results. We strongly believe that our model-based approach can be useful
for the systematic study of several other problems in which financial correlation
matrices play a crucial role.
2 Results and Discussion



We continue to consider the following simplified version of the classical portfolio
optimization problem introduced in [13]: the portfolio variance
$\sum_{i,j=1}^{n} w_i \sigma_{ij} w_j$ is minimized under the budget constraint
$\sum_{i=1}^{n} w_i = 1$, where $w_i$ denotes the weight of asset $i$ in the portfolio
and $\sigma_{ij}$ the covariance matrix of returns. This simplified form provides
the most convenient laboratory for testing the effect of noise in correlations, since it
eliminates the additional uncertainty arising from the determination of several other
parameters that appear in more complex formulations. The weights of the optimal
portfolio in this simple case are:

$$ w_i^{*} = \frac{\sum_{j=1}^{n} \sigma_{ij}^{-1}}{\sum_{j,k=1}^{n} \sigma_{jk}^{-1}} . \qquad (1) $$

Starting from a given "true" covariance matrix $\sigma^{(0)}_{ij}$ ($n \times n$) we
generate surrogate time series $y_{it}$ (of finite length $T$),
$y_{it} = \sum_{j=1}^{n} L_{ij} x_{jt}$, with $x_{jt} \sim$ i.i.d. $N(0, 1)$ and $L_{ij}$
the Cholesky decomposition of the matrix $\sigma^{(0)}_{ij}$. In this way we obtain
"return series" $y_{it}$ that have a distribution characterized by the "true" covariance
matrix $\sigma^{(0)}_{ij}$. Similar to real-life situations (where the true covariance
matrix is not known) we calculate different "estimates" $\sigma^{(1)}_{ij}$ for the
covariance matrix based on several competing procedures and then use these estimates in
our portfolio optimization. Finally, we compare the performance of these procedures using
metrics related to the risk (standard deviation) of the "optimal" portfolios constructed
from the corresponding estimates. The main advantage of this simulation-based approach is
that the "true" covariance matrix can be incorporated in the evaluation, which is
certainly much cleaner than using, as in empirical studies, some proxy for it (which in
turn introduces an additional source of noise).
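
The procedure just described can be sketched in a few lines. This is a minimal
illustration, assuming NumPy; the helper names `min_variance_weights` and
`surrogate_returns` are hypothetical, the "true" matrix is taken to be the identity for
simplicity, and `np.cov` subtracts the sample mean, which the text does not specify.

```python
import numpy as np

def min_variance_weights(sigma: np.ndarray) -> np.ndarray:
    """Eq. (1): w_i = sum_j sigma^-1_ij / sum_jk sigma^-1_jk."""
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)  # sigma^{-1} 1 without forming the inverse
    return x / x.sum()

def surrogate_returns(sigma0: np.ndarray, T: int, rng: np.random.Generator) -> np.ndarray:
    """Return an (n, T) array y = L x with L L^T = sigma0 and x ~ i.i.d. N(0, 1)."""
    L = np.linalg.cholesky(sigma0)  # lower-triangular Cholesky factor
    x = rng.standard_normal((sigma0.shape[0], T))
    return L @ x

rng = np.random.default_rng(0)
n, T = 50, 100
sigma0 = np.eye(n)                      # "true" covariance matrix (illustrative choice)
y = surrogate_returns(sigma0, T, rng)   # rows are assets, columns are observations
sigma1 = np.cov(y)                      # sample (historical) estimate

w0 = min_variance_weights(sigma0)       # optimal weights without noise
w1 = min_variance_weights(sigma1)       # weights obtained from the noisy estimate
# Risk of the noisy portfolio, measured with the true matrix, relative to the optimum.
ratio = np.sqrt((w1 @ sigma0 @ w1) / (w0 @ sigma0 @ w0))
print(ratio)
```

Because `w0` minimizes the variance under the true matrix, the printed ratio cannot fall
below 1; how far it exceeds 1 is exactly the effect of estimation noise studied here.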

In our previous study [13] we used a very simple structure ("model") for
$\sigma^{(0)}_{ij}$ (namely the identity matrix) and we studied the effect of noise when
the "estimated" matrix $\sigma^{(1)}_{ij}$ is the sample (or historical) covariance
matrix. In this paper we introduce several other "models" (proposals for the structure of
$\sigma^{(0)}_{ij}$) which are intended to incorporate progressively the most relevant
characteristics of real-life financial correlations (the models are given in terms of the
corresponding correlation matrix $\rho^{(0)}_{ij}$):

1. "Single-index", "market" or "average correlation" model. The correlation matrix
has 1 on the diagonal and a given $\rho_0$ ($0 < \rho_0 < 1$) off the diagonal (all
correlations are the same, hence the name "average correlation" model). The
eigenstructure of such a matrix consists of one large¹ eigenvalue, with corresponding
eigenvector in the direction of $(1, 1, \dots, 1)$, and an $(n-1)$-fold degenerate
eigenvalue whose subspace is orthogonal to the first eigenvector. The eigenspace of the
large eigenvalue can be thought of as describing correlations with a broad "index"
composed of all stocks (the "market"), hence the name "single-index" or "market"
model. This model is motivated by the similar salient feature of stock market
correlations found by numerous research studies (see e.g. [2] for references).

2. "Market+sectors" model. A very simple structure intended to incorporate the
more debated² sector feature of real-life financial correlations can be based on a
correlation matrix composed of $n_1 \times n_1$ blocks (with 1 on the diagonal and
$\rho_1$ off the diagonal) and $\rho_0$ outside the blocks ($0 < \rho_0 < \rho_1 < 1$ and
$n/n_1$ an integer). In this model there is still a strong influence of the "market",
but stocks from the same block ("industrial sector") display additional common
correlations. On the other hand, the eigenspectrum of such a matrix³ is closer to the
eigenspectrum of real-life financial correlation matrices as described e.g. in [10].
This correlation structure also fits better with the findings of E] , which, using a
hierarchical tree approach, also found that stocks tend to be coupled according to the
industrial sector they belong to.

3. "Semi-empirical" (bootstrapped) model. Starting from a large set of empirical
financial data⁴, for each portfolio size $n$ we randomly select (bootstrap) $n$ time
series from the set of empirical return data, and an $n \times n$ covariance matrix is
calculated using the full length of the available series. This matrix is then used as
$\sigma^{(0)}_{ij}$ in the simulations (to generate the surrogate data). In order to
examine the sensitivity of our results to the choice of the $n$ time series, we repeat
the simulations several times (with different bootstrapped empirical series) and
compare the results. The correlation structure of this model is hoped to be
the closest to real-world financial correlations, although its disadvantage is that,
similar to empirical studies, it is based on a given set of empirical data
which might be representative in certain situations but is still not fully general.

¹ $\lambda_1 = 1 + (n-1)\rho_0$, which for the usual values of the parameters is large
compared to $\lambda_2 = \lambda_3 = \dots = \lambda_n = 1 - \rho_0$.
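
The first two models are simple enough to be written down explicitly, and their
eigenvalues can be checked numerically against the closed-form expressions quoted in the
footnotes. The sketch below assumes NumPy; the function names `market_corr` and
`market_sectors_corr` and the parameter values are illustrative choices, not taken from
any reference implementation.

```python
import numpy as np

def market_corr(n: int, rho0: float) -> np.ndarray:
    """Model 1: ones on the diagonal, rho0 everywhere else."""
    return np.full((n, n), rho0) + (1.0 - rho0) * np.eye(n)

def market_sectors_corr(n: int, n1: int, rho0: float, rho1: float) -> np.ndarray:
    """Model 2: n/n1 diagonal blocks with rho1 inside the blocks, rho0 outside."""
    if n % n1 != 0:
        raise ValueError("n/n1 must be an integer")
    same_block = np.kron(np.eye(n // n1), np.ones((n1, n1)))  # 1 inside blocks, 0 outside
    corr = rho0 + (rho1 - rho0) * same_block
    np.fill_diagonal(corr, 1.0)
    return corr

n, n1, rho0, rho1 = 200, 25, 0.2, 0.4
ev1 = np.linalg.eigvalsh(market_corr(n, rho0))                    # ascending order
ev2 = np.linalg.eigvalsh(market_sectors_corr(n, n1, rho0, rho1))  # ascending order

# Largest eigenvalues, numerical versus closed form.
print(ev1[-1], 1 + (n - 1) * rho0)
print(ev2[-1], 1 + (n1 - 1) * rho1 + (n - n1) * rho0)
# Medium-size ("sector") eigenvalues and the degenerate bulk of model 2.
print(ev2[-2], 1 + (n1 - 1) * rho1 - n1 * rho0)
print(ev2[0], 1 - rho1)
```

For these parameter values the "market+sectors" spectrum has one market eigenvalue, seven
sector eigenvalues and a 192-fold degenerate bulk, in line with the multiplicities
$n/n_1 - 1$ and $n - n/n_1$.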

In the framework of each of the models introduced above, we investigate the
performance of three alternative choices for the "estimated" covariance matrix
$\sigma^{(1)}_{ij}$:

1. Sample (historical) covariance matrix.

2. "Single-index" covariance matrix, i.e. the matrix obtained from the sample
covariance matrix by a simplified filtering procedure similar to the one described
below, but considering only the largest eigenvalue (and the corresponding
eigenvector), which is believed to correspond to a broad market index covering all
stocks, see e.g. [10].

3. Filtered covariance matrix using the procedure based on random matrix theory
|Hl CDj. For this, one starts with the sample correlation matrix and keeps only
the eigenvalues and the corresponding eigenvectors reflecting deviations from random
matrix theory predictions (those outside the random matrix noise band) and
then constructs a "cleaned" correlation matrix such that the trace of the matrix
is preserved. The intuition behind this procedure is that deviations from random
matrix theory predictions should correspond to "information" and describe genuine
correlations in the system, while the eigenstates corresponding to random
matrix theory predictions should be manifestations of purely random "noise".
The filtered covariance matrix is then obtained from the filtered correlation
matrix and the sample standard deviations. This procedure is very much reminiscent of
principal component analysis, although classical multivariate analysis generally gives
no hints about how many components (factors) to include in the matrix
constructed using the principal components (see e.g. [IS]). The filtering procedure
based on random matrix theory can therefore be thought of as a theoretically
sound indication of the number of principal components to be included in the
analysis.

² See e.g. refs. [H].

³ The eigenstructure is formed of a large eigenvalue
$\lambda_1 = 1 + (n_1 - 1)\rho_1 + (n - n_1)\rho_0$, an $(n/n_1 - 1)$-fold degenerate
subspace corresponding to medium-size eigenvalues
$\lambda_2 = \lambda_3 = \dots = \lambda_{n/n_1} = 1 + (n_1 - 1)\rho_1 - n_1\rho_0$, and an
$(n - n/n_1)$-fold degenerate subspace with eigenvalues
$\lambda_{n/n_1 + 1} = \lambda_{n/n_1 + 2} = \dots = \lambda_n = 1 - \rho_1$.

⁴ The same dataset as in [13] has been used. We thank again J.-P. Bouchaud and L. Laloux
for making their data available to us.
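
The two filtering procedures can be sketched as follows. This is one possible reading of
the description above, not the exact procedure of the cited works: the noise band is
assumed to end at the Marchenko-Pastur edge $(1 + \sqrt{n/T})^2$ for unit variance, and
the eigenvalues inside the band are replaced by their average so that the trace is
preserved. The function name `filter_correlation` and its `keep` parameter are
hypothetical.

```python
from typing import Optional

import numpy as np

def filter_correlation(corr: np.ndarray, T: int, keep: Optional[int] = None) -> np.ndarray:
    """Keep the informative eigenvalues, flatten the rest, and preserve the trace.

    keep=None : keep eigenvalues above the assumed edge (1 + sqrt(n/T))^2.
    keep=1    : single-index variant that keeps only the largest eigenvalue.
    """
    n = corr.shape[0]
    vals, vecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    if keep is None:
        lam_max = (1.0 + np.sqrt(n / T)) ** 2
        keep = int(np.sum(vals > lam_max))
    cleaned = vals.copy()
    if keep < n:
        # Replace the noise band by its average so that the trace is unchanged.
        cleaned[: n - keep] = vals[: n - keep].mean()
    return (vecs * cleaned) @ vecs.T  # V diag(cleaned) V^T

# Surrogate data from the "market" model with illustrative parameters.
rng = np.random.default_rng(1)
n, T, rho0 = 100, 150, 0.2
corr0 = np.full((n, n), rho0) + (1.0 - rho0) * np.eye(n)
y = np.linalg.cholesky(corr0) @ rng.standard_normal((n, T))

sample_corr = np.corrcoef(y)          # rows are assets
std = y.std(axis=1, ddof=1)           # sample standard deviations
rmt_cov = filter_correlation(sample_corr, T) * np.outer(std, std)
market_cov = filter_correlation(sample_corr, T, keep=1) * np.outer(std, std)
print(np.trace(filter_correlation(sample_corr, T)))  # equals n up to rounding
```

Averaging the discarded eigenvalues is only one way of preserving the trace; other
conventions for the noise band (for instance, adjusting the variance for the market
eigenvalue) would shift the cut-off.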

To study the effect of noise on the portfolio optimization problem we use metrics
based on the following quantities:

1. $\sum_{i,j=1}^{n} w_i^{(0)*} \sigma_{ij}^{(0)} w_j^{(0)*}$, the "true" risk of the
optimal portfolio without noise, where $w_i^{(0)*}$ denotes the solution to the
optimization problem with $\sigma_{ij}^{(0)}$;

2. $\sum_{i,j=1}^{n} w_i^{(1)*} \sigma_{ij}^{(0)} w_j^{(1)*}$, the "true" risk of the
optimal portfolio determined in the case of noise, where $w_i^{(1)*}$ denotes the
solution to the optimization problem with $\sigma_{ij}^{(1)}$;

3. $\sum_{i,j=1}^{n} w_i^{(1)*} \sigma_{ij}^{(1)} w_j^{(1)*}$, the "predicted" risk
(cf. 01101113), that is the risk that can be observed when the optimization is based on
the "empirical" series;

4. $\sum_{i,j=1}^{n} w_i^{(1)*} \sigma_{ij}^{(2)} w_j^{(1)*}$, the "realized" risk
(cf. 011111113]), that is the risk that would be observed if the portfolio were held one
more "period", where $\sigma_{ij}^{(2)}$ is the covariance matrix calculated from the
returns in this second period.

To facilitate the comparison, we calculate the ratios of the square roots of the three
latter quantities to the square root of the first one, and denote these by $q_0$, $q_1$
and $q_2$, respectively. That is, $q_0$, $q_1$ and $q_2$ represent the "true", the
"predicted" and the "realized" risk, respectively, expressed in units of the "true" risk
in the absence of noise. In other words, $q_0$ describes directly the ability of a given
estimation procedure to provide the correct input for portfolio optimization, $q_1$
describes the bias one makes if one then uses the estimated matrix for the calculation of
the risk of the optimal portfolio, while $q_2$ is the risk measured if one waits and uses
the information from the new series for risk measurement (see also [13]).
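
The three ratios can be computed with a small helper, sketched here under simple
assumptions: the "true" matrix is the identity, the estimator is the sample covariance
matrix, and the second "period" has the same length $T$ as the first. The names
`risk` and `risk_ratios` are hypothetical.

```python
import numpy as np

def min_variance_weights(sigma: np.ndarray) -> np.ndarray:
    """Minimum-variance weights under the budget constraint sum(w) = 1."""
    x = np.linalg.solve(sigma, np.ones(sigma.shape[0]))
    return x / x.sum()

def risk(w: np.ndarray, sigma: np.ndarray) -> float:
    """Portfolio standard deviation sqrt(w' sigma w)."""
    return float(np.sqrt(w @ sigma @ w))

def risk_ratios(sigma0, sigma1, sigma2):
    """Return (q0, q1, q2) for the true, estimated and second-period matrices."""
    w0 = min_variance_weights(sigma0)   # optimum without noise
    w1 = min_variance_weights(sigma1)   # optimum based on the estimate
    base = risk(w0, sigma0)             # "true" risk in the absence of noise
    return risk(w1, sigma0) / base, risk(w1, sigma1) / base, risk(w1, sigma2) / base

rng = np.random.default_rng(2)
n, T = 100, 200
sigma0 = np.eye(n)
sigma1 = np.cov(rng.standard_normal((n, T)))  # estimation period
sigma2 = np.cov(rng.standard_normal((n, T)))  # second, independent period
q0, q1, q2 = risk_ratios(sigma0, sigma1, sigma2)
print(q0, q1, q2)
```

By construction $q_0 \ge 1$, since no portfolio can have lower true risk than the true
optimum, whereas $q_1$ is evaluated with the same noisy matrix that was used for the
optimization and therefore tends to understate the risk.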
We start by presenting the simulation results when the series have been generated
using the "market" model (for $\sigma^{(0)}_{ij}$). Since the main feature of the
correlation structure (one outstanding large eigenvalue) is, at least for the parameter
values used in our simulations, also preserved in the correlation matrix obtained from
the generated series ($\sigma^{(1)}_{ij}$), the results for the filtering based on the
largest eigenvalue and on random matrix theory are in fact the same. Therefore, we
proceed with comparing the performance of the historical and filtered estimation
procedures for different values of the model parameters $n$, $T$ and $\rho_0$ using the
evaluation metrics $q_0$, $q_1$, $q_2$ and $q_2/q_1$. A summary of our simulation results
is presented in Table 1.

Table 1: Optimal portfolio risk and performance indicators for the historical (h) and market
(m) correlation matrix estimators for different values of the parameters of the model
($\sigma^{(0)}_{ij}$). Entries marked "-" are not available (the sample matrix is singular for T < n).

rho_0     n     T   T/n  q_0(h)  q_0(m)  q_1(h)  q_1(m)  q_2(h)  q_2(m)  q_2/q_1(h)  q_2/q_1(m)
  0.2   200   300   1.5    1.77    1.11    0.56    0.78    1.77    1.13        3.16        1.46
  0.2  1000  1500   1.5    1.73    1.12    0.59    0.78    1.71    1.11        2.96        1.42
  0.6  1000  1500   1.5    1.75    1.11    0.58    0.77    1.75    1.12        3.01        1.45
  0.2  1000  2000     2    1.42    1.11    0.71    0.82    1.43    1.11        2.00        1.35
  0.2  1000  5000     5    1.11    1.07    0.89    0.91    1.12    1.07        1.26        1.18
  0.2  1000   500   0.5       -    1.12       -    0.57       -    1.12           -        1.92


It turns out that, for sufficiently large $n$ and $T$, the values of the $q$'s depend
strongly only on $T/n$ (and, interestingly, do not seem to depend on $\rho_0$). This can
also be seen from the results presented in Table 1 (the variation in the first 3 rows is
in fact within the usual standard deviation bounds). This is not very surprising as far
as the results for the historical matrix are concerned, since this case was studied in
our previous paper [13]. The strong dependence on $T/n$ seems to be valid, however, also
when the filtered matrix is used. One important difference to note is the significant
improvement in the risk characteristics of the optimal portfolio when the filtering
procedure is used for estimation: e.g. for $T/n = 2$, instead of obtaining a portfolio
with risk more than 40% larger than the truly optimal one (see $q_0$), using the
filtering procedure one can get portfolios with risk only about 10% larger. Furthermore,
as can also be seen from the table, using the filtered matrix one can obtain portfolios
close to the optimal one even for $T < n$, when the sample (historical) matrix is
singular and not at all appropriate for use in the optimization. This improvement in
performance is not difficult to understand, since with the filtering procedure one
implicitly incorporates into the "estimation" the additional information about the
structure of the correlation matrix. Note also that for all parameter values $q_2$ is
very close to $q_0$; therefore the risk measured in the second "period" seems to be a
good proxy for the "true" risk of the optimal portfolio.

We next present the results when the series are generated with the "market+sectors"
model, for different values of the parameters $n$, $T$, $n_1$, $\rho_0$ and $\rho_1$. Our
results are summarized in Table 2. The values of $q_2$ were again very close to $q_0$ and
have therefore been left out of the table. We have found that the values of the $q$'s in
the case of the historical and random matrix theory-based estimators again depend
strongly on $T/n$ and not on the values of the other parameters, while this is not true
for the estimator based on the largest eigenvalue only. This is illustrated in Fig. 1,
where $q_0$ for the three estimators is shown as a function of $T/n$ for different values
of the parameters $n$, $T$, $n_1$, $\rho_0$ and $\rho_1$. The dependence of $q_0$ for the
"single-index" estimator on the parameters $\rho_0$, $\rho_1$ and $n_1$ can be easily
understood, since either an increase of $\rho_1$ or $n_1$, or a decrease of $\rho_0$, can
be thought of as an increase in the relative strength of "inter-sector" correlations
(relative to the overall correlation corresponding to the "market"), and therefore an
estimator taking into account only the "market" component of correlations (and ignoring
the "sector" component) is of course expected to perform worse in this case. Another
important point to note is that, in most cases, the random matrix theory-based filtering
outperforms the single-index estimator, which in turn outperforms the historical
estimator. Moreover, the first two estimators can be used even when the latter provides a
singular matrix that is totally inappropriate as input to the portfolio optimization (for
$T < n$).

Table 2: Optimal portfolio risk and performance indicators for the historical (h), market
(m) and random matrix theory (r) correlation matrix estimators for different values of the
parameters of the model ($\sigma^{(0)}_{ij}$). Entries marked "-" are not available (the sample matrix is singular for T < n).

rho_0  rho_1  n_1     n     T  q_0(h)  q_0(m)  q_0(r)  q_1(h)  q_1(m)  q_1(r)  q_2/q_1(h)  q_2/q_1(m)  q_2/q_1(r)
  0.2    0.4   25   200   300    1.71    1.27    1.13    0.58    0.77    0.76        2.93        1.65        1.47
  0.2    0.4   25  1000  1500    1.75    1.28    1.13    0.58    0.77    0.76        3.07        1.63        1.46
  0.2    0.6   25  1000  1500    1.74    1.64    1.13    0.59    0.78    0.76        2.94        2.09        1.47
  0.4    0.6   25  1000  1500    1.73    1.36    1.13    0.58    0.77    0.76        2.96        1.77        1.49
  0.2    0.4   50  1000  1500    1.71    1.42    1.12    0.58    0.77    0.77        2.96        1.84        1.46
  0.2    0.4   25  1000  2000    1.42    1.24    1.12    0.70    0.82    0.81        1.99        1.50        1.37
  0.2    0.4   25  1000  5000    1.11    1.16    1.07    0.89    0.91    0.90        1.24        1.27        1.17
  0.2    0.4   25  1000   500       -    1.24    1.19       -    0.58    0.55           -        2.14        2.17

Fig. 1: $q_0$ as a function of $T/n$ for different values of the parameters $\rho_0$,
$\rho_1$ and $n_1$ and different values of $n$ and $T$. In the case of the historical and
random matrix theory estimators (h and r, resp.) the points line up approximately on a
line (solid and dotted, resp.). For the market estimator (m), however, the dependence on
virtually all the parameters is clear from the figure (e.g. an increase in either
$\rho_1$ or $n_1$ leads to an increase of $q_0$).

Finally, we analyze the performance of the three correlation matrix estimators in
the case of the "semi-empirical" model for $\sigma^{(0)}_{ij}$ (the matrix is
bootstrapped from the empirical matrix of a given large set of financial series). More
precisely, for each value of the parameter $n$, we select at random $n$ series from the
available dataset and calculate the historical matrix, which is then used as
$\sigma^{(0)}_{ij}$ in our simulations⁵. Our results are summarized in Table 3 (the
values of $q_2$ have again been left out of the table). In this case, the $q$'s for the
two filtered matrix estimators do not depend so strongly on $T/n$ alone; some dependence
on $n$ (and $T$) can also be observed (see Fig. 2). It can be said again that, in
general, the filtering procedures significantly outperform the historical matrix
estimation, with the filtering based on the random matrix theory approach performing the
best.



Table 3: Optimal portfolio risk and performance indicators for the historical (h), market
(m) and random matrix theory (r) correlation matrix estimators for different values of the
parameters of the model ($\sigma^{(0)}_{ij}$). Entries marked "-" are not available (the sample matrix is singular for T < n).

    n     T   T/n  q_0(h)  q_0(m)  q_0(r)  q_1(h)  q_1(m)  q_1(r)  q_2/q_1(h)  q_2/q_1(m)  q_2/q_1(r)
  200   300   1.5    1.70    1.30    1.20    0.58    0.78    0.83        3.03        1.67        1.44
  300   450   1.5    1.74    1.48    1.24    0.58    0.76    0.84        2.99        1.94        1.45
  300   600     2    1.41    1.50    1.21    0.71    0.77    0.90        2.02        1.95        1.35
  300  1500     5    1.12    1.53    1.15    0.89    0.80    0.96        1.26        1.92        1.21
  300   150   0.5       -    1.41    1.33       -    0.76    0.73           -        2.02        1.85



⁵ Since most of the values of the length $T$ of the time series used in our simulations
are small compared to the length of the original dataset from which $\sigma^{(0)}_{ij}$
is computed, the noise due to the "measurement error" of $\sigma^{(0)}_{ij}$ can be hoped
to be small compared to the noise (deliberately) introduced by the finiteness of $T$.
Fig. 2: $q_0$ as a function of $T/n$ for different values of $n$ and $T$. In the case of
the historical estimator (h) the points lie approximately on a line. For the market and
random matrix theory estimators (m and r, resp.), however, the dependence on $n$ and $T$
is clear from the figure.

In conclusion, our simulation study provides a more general argument for the
usefulness of techniques for "massaging" empirical correlation matrices before using them
as inputs for portfolio optimization, as suggested e.g. by |U El EE UHj. Furthermore, it
re-emphasizes the fruitfulness of the random matrix theory-based filtering procedure
for portfolio selection applications.

There are several possibilities to extend the analysis of this paper. One main
direction would be to develop "models" that incorporate more subtle features of real-life
financial correlations. For example, an important feature of real financial series that
has been neglected is non-stationarity. Incorporating the dynamics of correlations
into the model could result in a more realistic description of correlations. For
example, models such as ARCH/GARCH and their numerous variants (see e.g. ^B] for
an overview) have been found to be fruitful in describing the dynamics of changing
volatility (and also of correlations in the multivariate setting). On the other hand,
estimation techniques based on similar rationales (for example RiskMetrics [£l\) have
been widely utilized by financial practitioners. These estimation procedures typically
run into the dimensionality problem already for $n = 4$ or $5$, but fortunately the
principal component/factor approach has proved useful here as well [TH|. A simple way to
take account of non-stationarity in our "estimation" would be to use exponential
weighting of observations in the calculation of the correlation matrix (in the spirit of
RiskMetrics) and then apply the filtering to this matrix. Of course, this should be
preceded by the derivation of the corresponding formulae for the noise band of matrices
with this new structure. Another way to extend the analysis of this study is to use the
model (simulation)-based approach for evaluating the performance of several other
correlation matrix estimators introduced in the literature or used in practice.
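
As an illustration of the exponential weighting mentioned above, the following minimal
sketch computes an exponentially weighted covariance matrix. It only performs the
weighting step; the noise band needed for filtering such a matrix is not derived here.
The function name `ewma_covariance` is hypothetical, returns are assumed to have zero
mean, and the default decay factor of 0.94 is an illustrative choice in the spirit of
RiskMetrics for daily data.

```python
import numpy as np

def ewma_covariance(y: np.ndarray, decay: float = 0.94) -> np.ndarray:
    """Exponentially weighted covariance of an (n, T) return array.

    The last column is the most recent observation. The weights decay**k
    (k = age of the observation) are normalised to sum to one, and the
    returns are assumed to have zero mean.
    """
    T = y.shape[1]
    weights = decay ** np.arange(T - 1, -1, -1)  # oldest ... newest
    weights = weights / weights.sum()
    return (y * weights) @ y.T  # sum_t weights[t] * y[:, t] y[:, t]^T

rng = np.random.default_rng(3)
y = rng.standard_normal((5, 250))     # 5 assets, 250 observations
cov_ewma = ewma_covariance(y)         # recent observations dominate
cov_flat = ewma_covariance(y, 1.0)    # decay = 1 recovers equal weighting
print(np.allclose(cov_flat, y @ y.T / y.shape[1]))
```

With a decay factor below 1 the effective number of observations is much smaller than
$T$, so the estimation noise studied in this paper would be expected to increase
accordingly.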

The implications of successful noise filtering in correlation matrices used for
portfolio optimization are enormous. Correlation matrices are not only at the heart of
modern finance and investment theory, but also appear in most practical risk management
and asset allocation procedures used in the financial industry. In particular, most
implementations of practical risk-return portfolio optimization or benchmark tracking
(minimization of risk with respect to a given benchmark) involve either correlation
matrices or "scenarios" usually generated using correlation matrices, see e.g. ^H]. A
short overview of the techniques used by practitioners for reducing noise and
estimation error in correlation matrices can be found in [20]. The filtering procedure
based on random matrix theory fits well into this package and can prove very useful for
reducing estimation error and its consequences. On the other hand, from a purely academic
point of view, understanding the structure and dynamics of correlations in financial
markets is still of central interest in finance and related fields; therefore any study
that makes it possible to reveal finer and finer details of the structure of these
correlations can be of great importance.

3 Conclusion 

In this paper we introduced a model (simulation)-based approach which can be used 
for a systematic investigation of the effect of different sources of noise in correlation 
matrices determined from financial return series. To show the usefulness of this ap- 
proach we developed several toy models for the structure of financial correlations and, 
by considering only the noise arising from the finite length of the model-generated time 
series, we analyzed the performance of several correlation matrix estimation procedures 
in a simple portfolio optimization context. 

The results of this study can be extended in numerous ways, some of which
are briefly given next. First, by developing models that incorporate finer and finer
elements of the structure of financial correlations, the relevance of the results can
be increased further. For example, allowing for some dynamics (non-stationarity) in
correlations could make it possible to analyze the effect of noise due to non-stationarity
or due to the estimation error of the parameters of some dynamic models on the
portfolio optimization problem. Second, the analysis could be extended to several
other correlation estimation procedures introduced in the literature, e.g. the true
single-index model (with betas), multi-index models, different factor estimation
procedures, Bayesian estimators etc. (see for example [U El UHl ED]). The models
(simulations) could also be used for studying the performance of several other techniques
for the extraction of correlation information, such as the hierarchical tree methods of
[11, 12]. Third, our model-based approach can also be used in a more complex optimization
framework, e.g. in that of the classical mean-variance efficient frontier rather than just
in the simple global optimization framework used in this paper. Last, but not least, the
approach could also be used for the study of other, more general "correlation"
measures if instead of the portfolio standard deviation some other more sophisticated
risk measure (e.g. Conditional Value-at-Risk) is used.